Batch weight for domain adaptation with mass shift
Unsupervised domain transfer is the task of transferring or translating
samples from a source distribution to a different target distribution.
Current solutions to unsupervised domain transfer often operate on data in
which the modes of the distribution are well matched, for instance, having
the same class frequencies in the source and target distributions. However,
these models do not
perform well when the modes are not well-matched, as would be the case when
samples are drawn independently from two different, but related, domains. This
mode imbalance is problematic as generative adversarial networks (GANs), a
successful approach in this setting, are sensitive to mode frequency, which
results in a mismatch of semantics between source samples and generated samples
of the target distribution. We propose a principled method of re-weighting
training samples to correct for such mass shift between the transferred
distributions, which we call batch-weight. We also provide a rigorous
probabilistic setting for domain transfer and a new, simplified objective for
training transfer networks, an alternative to the complex, multi-component
loss functions used in current state-of-the-art image-to-image translation
models. The new objective stems from the discrimination of joint distributions
and enforces cycle-consistency in an abstract, high-level, rather than
pixel-wise, sense. Lastly, we experimentally show the effectiveness of the
proposed methods on several image-to-image translation tasks.
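The frequency-ratio idea behind re-weighting can be sketched as follows. This is a minimal illustration only, assuming class labels are available to estimate mode frequencies; the paper's batch-weight method addresses the general unlabeled case, and the function name here is hypothetical.

```python
import numpy as np

def batch_weights(source_labels, target_freqs):
    """Weight each source sample by target_freq / source_freq of its mode."""
    classes, counts = np.unique(source_labels, return_counts=True)
    source_freqs = counts / counts.sum()
    ratio = {c: target_freqs[c] / f for c, f in zip(classes, source_freqs)}
    w = np.array([ratio[c] for c in source_labels])
    return w / w.sum()  # normalize so the weights sum to 1

# Source batch over-represents class 0 (0.75) relative to the target (0.5).
labels = np.array([0, 0, 0, 1])
weights = batch_weights(labels, {0: 0.5, 1: 0.5})
```

Samples from the under-represented mode (class 1) receive proportionally more weight, correcting the mass shift within each training batch.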
Variational Autoencoders for Feature Detection of Magnetic Resonance Imaging Data
Independent component analysis (ICA), as an approach to the blind
source-separation (BSS) problem, has become the de-facto standard in many
medical imaging settings. Despite successes and a large ongoing research
effort, the limitation of ICA to square linear transformations has not been
overcome, so that general INFOMAX is still far from being realized. As an
alternative, we present feature analysis in medical imaging as a problem solved
by Helmholtz machines, which include dimensionality reduction and
reconstruction of the raw data under the same objective, and which recently
have overcome major difficulties in inference and learning with deep and
nonlinear configurations. We demonstrate one approach to training Helmholtz
machines, variational auto-encoders (VAE), as a viable approach toward feature
extraction with magnetic resonance imaging (MRI) data.
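The VAE objective the entry refers to can be sketched numerically. This is a toy single-sample ELBO with fixed linear encoder/decoder maps, purely for illustration; the actual models use deep nonlinear networks trained on MRI volumes.

```python
import numpy as np

rng = np.random.default_rng(0)
D, Z = 8, 2                                  # data and latent dimensionality
W_enc = rng.normal(size=(2 * Z, D)) * 0.1    # encoder: x -> (mu, log_var)
W_dec = rng.normal(size=(D, Z)) * 0.1        # decoder: z -> x_hat

def elbo(x):
    h = W_enc @ x
    mu, log_var = h[:Z], h[Z:]
    eps = rng.normal(size=Z)
    z = mu + np.exp(0.5 * log_var) * eps     # reparameterization trick
    x_hat = W_dec @ z
    recon = -0.5 * np.sum((x - x_hat) ** 2)  # Gaussian log-likelihood (up to const.)
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)  # KL(q || N(0, I))
    return recon - kl                        # evidence lower bound

x = rng.normal(size=D)
value = elbo(x)
```

Maximizing this bound trains dimensionality reduction (the encoder) and reconstruction (the decoder) under one objective, which is the property the abstract highlights.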
Leveraging exploration in off-policy algorithms via normalizing flows
The ability to discover approximately optimal policies in domains with sparse
rewards is crucial to applying reinforcement learning (RL) in many real-world
scenarios. Approaches such as neural density models and continuous exploration
(e.g., Go-Explore) have been proposed to maintain the high exploration rate
necessary to find high performing and generalizable policies. Soft
actor-critic (SAC) is another method for improving exploration that combines
efficient learning via off-policy updates with maximization of the policy
entropy. In this work, we extend SAC to a richer class of probability
distributions (e.g., multimodal) through normalizing flows (NF) and show that
this significantly improves performance by accelerating the discovery of good
policies while using much smaller policy representations. Our approach, which
we call SAC-NF, is a simple, efficient, easy-to-implement modification of SAC
that improves on it across continuous control benchmarks such as the MuJoCo
and PyBullet Roboschool domains. Finally, SAC-NF achieves this while being
significantly more parameter efficient, using as few as 5.5% of the
parameters of an equivalent SAC model.
Comment: Accepted to 3rd Conference on Robot Learning (CoRL 2019); Keywords:
Exploration, soft actor-critic, normalizing flow, off-policy; maximum
entropy, reinforcement learning; deceptive reward; sparse reward; inverse
autoregressive flow
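A normalizing-flow layer of the kind used to enrich a Gaussian policy can be sketched as a single planar flow. This is illustrative only; SAC-NF stacks learned flows, and the paper's keywords mention more expressive inverse autoregressive flows.

```python
import numpy as np

def planar_flow(z, w, b, u):
    """f(z) = z + u * tanh(w . z + b), with log |det Jacobian| for the density."""
    a = np.tanh(w @ z + b)
    f_z = z + u * a
    psi = (1.0 - a**2) * w                   # derivative of tanh term w.r.t. z
    log_det = np.log(np.abs(1.0 + u @ psi))  # change-of-variables correction
    return f_z, log_det

rng = np.random.default_rng(0)
z0 = rng.normal(size=2)                      # sample from the base Gaussian policy
w, b, u = np.array([1.0, -0.5]), 0.1, np.array([0.3, 0.2])
z1, log_det = planar_flow(z0, w, b, u)
# log-density of the transformed action: log p(z0) - log_det
```

Passing the Gaussian sample through such layers yields richer (e.g., multimodal) action distributions while the log-determinant keeps the entropy term in SAC tractable.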
Deep learning for neuroimaging: a validation study
Deep learning methods have recently made notable advances in the tasks of
classification and representation learning. These tasks are important for brain
imaging and neuroscience discovery, making the methods attractive for porting
to a neuroimager's toolbox. Success of these methods is, in part, explained by
the flexibility of deep learning models. However, this flexibility makes the
process of porting to new areas a difficult parameter optimization problem. In
this work we demonstrate our results (and feasible parameter ranges) in
application of deep learning methods to structural and functional brain imaging
data. We also describe a novel constraint-based approach to visualizing high
dimensional data. We use it to analyze the effect of parameter choices on data
transformations. Our results show that deep learning methods are able to learn
physiologically important representations and detect latent relations in
neuroimaging data.
Comment: ICLR 2014 revision
Variance Regularizing Adversarial Learning
We introduce a novel approach for training adversarial models by replacing
the discriminator score with a bi-modal Gaussian distribution over the
real/fake indicator variables. In order to do this, we train the Gaussian
classifier to match the target bi-modal distribution implicitly through
meta-adversarial training. We hypothesize that this approach ensures a non-zero
gradient to the generator, even in the limit of a perfect classifier. We test
our method against standard benchmark image datasets as well as show the
classifier output distribution is smooth and has overlap between the real and
fake modes.
Comment: Method is out of date and some results are incorrect.
Learning Representations by Maximizing Mutual Information Across Views
We propose an approach to self-supervised representation learning based on
maximizing mutual information between features extracted from multiple views of
a shared context. For example, one could produce multiple views of a local
spatio-temporal context by observing it from different locations (e.g., camera
positions within a scene), and via different modalities (e.g., tactile,
auditory, or visual). Or, an ImageNet image could provide a context from which
one produces multiple views by repeatedly applying data augmentation.
Maximizing mutual information between features extracted from these views
requires capturing information about high-level factors whose influence spans
multiple views -- e.g., presence of certain objects or occurrence of certain
events.
Following our proposed approach, we develop a model which learns image
representations that significantly outperform prior methods on the tasks we
consider. Most notably, using self-supervised learning, our model learns
representations which achieve 68.1% accuracy on ImageNet using standard linear
evaluation. This beats prior results by over 12% and concurrent results by 7%.
When we extend our model to use mixture-based representations, segmentation
behaviour emerges as a natural side-effect. Our code is available online:
https://github.com/Philip-Bachman/amdim-public
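A minimal sketch of a mutual-information bound between features from two views is the InfoNCE-style contrastive loss below. The AMDIM objective and encoder are considerably more involved (local and global features, multiple scales); names and shapes here are illustrative.

```python
import numpy as np

def info_nce(f1, f2):
    """f1[i], f2[i] are features of two views of sample i; returns mean loss."""
    scores = f1 @ f2.T                                  # pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)         # numerical stability
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))               # positives on the diagonal

rng = np.random.default_rng(0)
base = rng.normal(size=(4, 8))
v1 = base + 0.01 * rng.normal(size=(4, 8))              # two lightly perturbed views
v2 = base + 0.01 * rng.normal(size=(4, 8))
loss = info_nce(v1, v2)
```

Minimizing this loss forces each feature to identify its matching view among the batch, which is one standard way to maximize a lower bound on the mutual information between views.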
Iterative Refinement of the Approximate Posterior for Directed Belief Networks
Variational methods that rely on a recognition network to approximate the
posterior of directed graphical models offer better inference and learning than
previous methods. Recent advances that exploit the capacity and flexibility in
this approach have expanded what kinds of models can be trained. However, as a
proposal for the posterior, the capacity of the recognition network is limited,
which can constrain the representational power of the generative model and
increase the variance of Monte Carlo estimates. To address these issues, we
introduce an iterative refinement procedure for improving the approximate
posterior of the recognition network and show that training with the refined
posterior is competitive with state-of-the-art methods. The advantages of
refinement are further evident in an increased effective sample size, which
implies a lower variance of gradient estimates.
Boundary-Seeking Generative Adversarial Networks
Generative adversarial networks (GANs) are a learning framework that relies
on training a discriminator to estimate a measure of difference between the
target and generated distributions. GANs, as normally formulated, rely on the
generated samples being completely differentiable w.r.t. the generative
parameters, and thus do not work for discrete data. We introduce a method for
training GANs with discrete data that uses the estimated difference measure
from the discriminator to compute importance weights for generated samples,
thus providing a policy gradient for training the generator. The importance
weights have a strong connection to the decision boundary of the discriminator,
and we call our method boundary-seeking GANs (BGANs). We demonstrate the
effectiveness of the proposed algorithm with discrete image and character-based
natural language generation. In addition, the boundary-seeking objective
extends to continuous data, which can be used to improve stability of training,
and we demonstrate this on CelebA, Large-scale Scene Understanding (LSUN)
bedrooms, and ImageNet without conditioning.
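The importance-weight idea can be sketched directly from the discriminator's output. With D(x) = sigmoid(T(x)), the likelihood-ratio estimate is w = D / (1 - D); the values below are illustrative, not the paper's full estimator.

```python
import numpy as np

def importance_weights(d_scores):
    """Self-normalized importance weights from discriminator probabilities."""
    w = d_scores / (1.0 - d_scores)   # likelihood-ratio estimate D / (1 - D)
    return w / w.sum()                # normalize across the batch of samples

d = np.array([0.5, 0.8, 0.2])         # D(x) on three generated samples
w = importance_weights(d)
# Generated samples the discriminator finds more plausible receive more
# weight; these weights then scale the policy gradient for the generator.
```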
MINE: Mutual Information Neural Estimation
We argue that the estimation of mutual information between high dimensional
continuous random variables can be achieved by gradient descent over neural
networks. We present a Mutual Information Neural Estimator (MINE) that is
linearly scalable in dimensionality as well as in sample size, trainable
through back-prop, and strongly consistent. We present a handful of
applications on which MINE can be used to minimize or maximize mutual
information. We apply MINE to improve adversarially trained generative models.
We also use MINE to implement Information Bottleneck, applying it to supervised
classification; our results demonstrate substantial improvement in flexibility
and performance in these settings.
Comment: 19 pages, 6 figures
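The estimator rests on the Donsker-Varadhan representation, I(X; Z) >= E_p(x,z)[T(x,z)] - log E_p(x)p(z)[exp(T(x,z))], which MINE maximizes over a neural critic T by gradient descent. The sketch below evaluates the bound with a fixed toy statistic in place of a trained network, purely to show its shape.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
z = x + 0.5 * rng.normal(size=n)      # correlated pair: samples from p(x, z)
z_shuffled = rng.permutation(z)       # shuffling breaks the pairing: p(x)p(z)

def T(x, z):
    # Toy statistic standing in for the critic network; scaled so the
    # exponential-moment term stays finite for this Gaussian example.
    return 0.3 * x * z

dv_bound = T(x, z).mean() - np.log(np.mean(np.exp(T(x, z_shuffled))))
```

In MINE proper, T is a neural network and this bound is tightened by backpropagating through both terms; the true mutual information for this toy pair is 0.5 * log(5), about 0.80 nats, so the fixed statistic gives a loose but valid lower bound.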
Prediction of Progression to Alzheimer's disease with Deep InfoMax
Arguably, unsupervised learning plays a crucial role in the majority of
algorithms for processing brain imaging. A recently introduced unsupervised
approach Deep InfoMax (DIM) is a promising tool for exploring brain structure
in a flexible non-linear way. In this paper, we investigate the use of variants
of DIM in a setting of progression to Alzheimer's disease in comparison with
supervised AlexNet and ResNet inspired convolutional neural networks. As a
benchmark, we use a classification task among four groups: patients with
stable mild cognitive impairment (MCI), patients with progressive MCI,
patients with Alzheimer's disease, and healthy controls. Our dataset
comprises 828 subjects from
the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Our
experiments highlight encouraging evidence of the high potential utility of DIM
in future neuroimaging studies.
Comment: Accepted to 2019 IEEE Biomedical and Health Informatics (BHI) as a
conference paper.